Full-text indexes are special indexes that
efficiently track the words you’re looking for in a table. They help in
enabling special searching functions that differ from regular indexes.
Full-text indexes are not automatically updated, and they reside in a
storage space called a full-text catalog. Full-text indexes are stored in the file system, not in the database. They are, however, administered through the database.
Full-text
catalog files are not recovered during a SQL Server recovery. They also
cannot be backed up and restored by using the T-SQL BACKUP and RESTORE statements. The full-text catalogs must be resynchronized separately after a recovery or restore operation.
Creating and Populating a Catalog
Before
you can begin using full-text capabilities, you must create and
populate a catalog on the server. The catalog will be stored in the
file system. Full-text catalogs must be created on a local hard drive.
You should store the catalog in a folder created solely for that
purpose. It is recommended that you set up a secondary filegroup for
the catalog storage. You could use the following to create the
filegroup and catalog:
ALTER DATABASE AdventureWorks ADD FILEGROUP FullTextCat
GO
ALTER DATABASE AdventureWorks ADD FILE (NAME = N'FullTextCatalog',
FILENAME = N'C:\FullTextCatalogs\FTCat.ndf',
SIZE = 10MB, FILEGROWTH = 5MB)
TO FILEGROUP FullTextCat
GO
CREATE FULLTEXT CATALOG ftCatalog
ON FILEGROUP FullTextCat
IN PATH 'C:\FullTextCatalogs'
AS DEFAULT
GO
With
the storage location set up, you can begin defining and creating the
indexes to be used. These indexes can be set up on any text-based data
stored in the database, including varbinary
data that stores a document. The T-SQL syntax for creating these
indexes has changed since SQL 2000 and is now similar to the following:
CREATE FULLTEXT INDEX
ON Production.Document(
Title, FileName, DocumentSummary, Document TYPE COLUMN FileExtension)
KEY INDEX PK_Document_DocumentID
GO
After
you create an index, you need to administer the population. It is
recommended that you begin with a full population (the default upon
creation) and then schedule population updates periodically afterward.
The frequency of the schedule depends on the frequency of changes
within the data and the latency requirements of the system. You can use
ALTER FULLTEXT INDEX to perform the repopulation.
You can now perform queries by using CONTAINS to search within a single column or CONTAINSTABLE
to search through the entire table. You can use a variety of specialty
search capabilities, including looking for various word forms or
proximate searches for multiple words and other forms of fuzzy
searches. The basic form of a query would look similar to the following:
SELECT * FROM Production.Document
WHERE CONTAINS(DocumentSummary, 'safety')
GO
You can also use FREETEXT or FREETEXTTABLE for more free-form queries. When you use either of these two commands, you can perform a fuzzy search for matches to phrases.